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Abstract 

Peripheral blood mononuclear cells (PBMCs), including rare circulating stem and progenitor cells (CSPCs), have important 
yet poorly understood roles in the maintenance and repair of blood vessels and perfused organs. Our hypothesis was that 
the identities and functions of CSPCs in cardiovascular health could be ascertained by analyzing the patterns of their co- 
expressed markers in unselected PBMC samples. Because gene microarrays had failed to detect many stem cell-associated 
genes, we performed quantitative real-time PCR to measure the expression of 45 primitive and tissue differentiation markers 
in PBMCs from healthy and hypertensive human subjects. We compared these expression levels to the subjects' 
demographic and cardiovascular risk factors, including vascular stiffness. The tested marker genes were expressed in all of 
samples and organized in hierarchical transcriptional network modules, constructed by a bottom-up approach. An index of 
gene expression in one of these modules (metagene), defined as the average standardized relative copy numbers of 15 
pluripotency and cardiovascular differentiation markers, was negatively correlated (all p<0.03) with age (R 2 =— 0.23), 
vascular stiffness (R 2 =— 0.24), and central aortic pressure (R 2 =— 0.19) and positively correlated with body mass index 
(R 2 = 0.72, in women). The co-expression of three neovascular markers was validated at the single-cell level using mRNA in 
situ hybridization and immunocytochemistry. The overall gene expression in this cardiovascular module was reduced by 
72+22% in the patients compared with controls. However, the compactness of both modules was increased in the patients' 
samples, which was reflected in reduced dispersion of their nodes' degrees of connectivity, suggesting a more primitive 
character of the patients' CSPCs. In conclusion, our results show that the relationship between CSPCs and vascular function 
is encoded in modules of the PBMCs transcriptional network. Furthermore, the coordinated gene expression in these 
modules can be linked to cardiovascular risk factors and subclinical cardiovascular disease; thus, this measure may be useful 
for their diagnosis and prognosis. 
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Introduction 

Circulating stem/progenitor cells (CSPCs) contribute to the 
maintenance of the normal functions of blood vessels and tissues 
and their repair and regeneration [1]. These cells may also 
promote tumor growth by facilitating neovascularization or the 
development of tumor stroma [2]. CSPCs and other leukocytes 
mediate these actions through the release of paracrine factors [3] 
and occasionally by transdifferentiation [4]. The numbers and 
functions of CSPCs are impaired by exposure to cardiovascular 
risk factors, such as aging, diabetes, hyperlipidemia, or hyperten- 
sion (for a review, see [5]). Moreover, the frequency of CSPCs was 



inversely related to subclinical vascular diseases, including 
endothelial dysfunction and arterial stiffness [6]. 

A major obstacle to progress in this field has been a lack of 
consensus regarding the precise molecular markers that define 
these regenerative pathways [7] . This problem is compounded by 
the limited accuracy and reproducibility of the current methods 
used to quantitate CSPCs, such as flow cytometry [8] and in vitro 
colony formation assays for 'early' [9] or 'late' [10] progenitor 
cells. Potential novel tools that may be used to address these issues 
include the emerging network sciences as applied to biology and 
medicine [11]. 
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Transcription progressively enables primitive cells to acquire a 
differentiated phenotype [12], whereas the expression of primitive 
genes is indicative of cell sternness in both bone marrow and blood 
[13]. However, the reason for the presence of mRNA for tissue- 
specific differentiation genes in circulating cells is less clear. 
Because more than 80% of the genes expressed in human 
peripheral blood are also expressed in other body tissues [14], 
mRNA profiling of leukocytes has been proposed as an accessible 
window to the multi-organ transcriptome [15]. Additionally, the 
transcriptional landscape, including those of adult hematopoietic 
stem cells and adult leukocytes, is organized as a modular network 
of co-expressed genes [16]. Cardiovascular disease-associated 
transcriptomic signatures are known to exist in peripheral blood 
[17]; however, none has yet been found to specifically contain 
CSPC markers or be directly relevant to vascular function in 
healthy subjects. 

Our hypothesis was that the origins of primitive and tissue- 
specific mRNAs in peripheral blood mononuclear cell (PBMC) 
samples would be primarily, although not exclusively, in CSPCs. If 
supported by data, then the coordinated expression of CSPC- 
derived mRNAs should be detectable in peripheral blood 
transcriptional profiles and reflect the function of the correspond- 
ing tissues, similar to the actual tissue-specific CSPCs. Here, we 
developed and functionally validated such a method, which applies 
network science to transcriptomic analyses. 

Because the high-throughput charting of a transcriptome either 
produces many irrelevant hits or is often too insensitive for specific 
targets [18], predesigned gene panels are increasingly used for the 
detection of gene expression signatures in tissues [19] and the 
assessment of pluripotency [20] or differentiation hierarchy in 
stem cells [21]. To detect rare transcripts, the most reliable 
technique to date remains quantitative real-time PCR (qRT- 
PCR), which is accurate, precise, more sensitive than microarrays, 
and more specific for mature transcripts than RNA sequencing 
[18]. qRT-PCR has been used to generate transcriptional 
networks from as few as 18 transcription factors [22] to as many 
as 280 of the 'most-used' hand-picked stem cell markers [21]. The 
transcriptional signatures of individual CSPC-associated markers 
have been previously detected using qRT-PCR in human and 
animal peripheral blood at baseline and after major cardiovascular 
and neurovascular events [23]. 

As reported here, microarrays (specifically, Affymetrix Gene- 
Chips) were unable to detect many of the stem cell-related genes in 
PBMCs samples, a finding that confirmed previous studies [21]. 
Thus, to create a sensitive and efficient assay, we designed a gene 
panel that included the most-recognized stem/progenitor cell and 
tissue differentiation markers (Table SI). These markers were 
simultaneously measured in PBMCs using qRT-PCR and then 
subjected to a network analysis and validated for vascular function. 

Methods 

Subjects 

Institutional Review Board (IRB) approvals were obtained from 
The Ohio State University and Emory University, and all subjects 
signed written informed consent forms. We recruited 26 healthy 
volunteers at Ohio State University and 20 hypertensive subjects 
at Emory University (Table 1 and Table 2, respectively). 

Blood pressure and pulse wave measurements 

Brachial blood pressure (BP) and radial pulse wave values were 
measured in the healthy subjects using a manual oscillometric 
monitor with a standard adult cuff and the SphygmoCor device 
(AtCor Medical, Sydney, Australia), respectively. The latter 



Table 1. Characteristics of the control subjects. 





Parameter 


Value 


Percent/range 


Gender 


Males 


12 


46.2% 


Females 


14 


53.8% 


Race 


White 


16 


61 .5% 


African American 


7 


27.0% 


Asian 


3 


1 1 .5% 


Age, years 


40.6 ±11.6 


19-58 


Height (cm) 


171.2±11.4 


152-193 


Weight (kg) 


83.2±21.2 


60-136 


BMI 


28.7 ±8.1 


19.2-53.13 


Seated SBP (mm Hg) 


113.7±11.6 


95-140 


Seated DBP (mm Hg) 


74.5 ±7.9 


58-87 


Heart rate (bmp) 


68.7±10.7 


52-100 


Alx 


17.4±12.4 


(-8)-43 


AoPP 


29.8±8.2 


19-50 


AoSP 


104.8±10.5 


87-124 


doi:10.1371/journal.pone.0095124.t001 




Table 2. Characteristics of the hypertensive subjects. 




Parameter 


Value 


Percent/range 


Gender 


Males 


14 


70% 


Females 


6 


30% 


Race 


White 


8 


40% 


African American 


11 


55% 


Hispanic 1 5% 


Age, years 


65.7±11.3 


41-85 


Height (cm) 


171.4±8.4 


155-188 


Weight (kg) 


84.7 ±20.7 


51-118 


BMI 


28.8±6.8 


20-39 


Seated SBP (mm Hg) 


149.0±21.9 


116-194 


Seated DBP (mm Hg) 


80.0±10.5 


63-99 


Heart rate (bpm) 


73.3±12.8 


56-102 


doi:10.1371/journal.pone.0095124.t002 



provides a validated generalized transfer function to convert the 
peripheral radial arterial pulse wave into the equivalent central 
aortic arterial pulse wave, which was used to analyze the derived 
aortic pressure waveform and extract the aortic pulse pressure 
(AoPP) and the augmentation index (Alx) of the pulse, a surrogate 
measure of vascular stiffness [24]. 

Magnetic resonance elastography (MRE) 

MRE is a magnetic resonance imaging (MRI)-based method 
that directly determines organ stiffness and has been validated in a 
study of hypertensive patients [25]. In aortic MRE, vibrations are 
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applied to the abdomen by an electro-mechanical driver, which 
transmits mechanical waves to the aorta. A phase-contrast MRI 
sequence synchronized to externally applied vibrations measures 
the wave displacement field, which is mathematically converted to 
stiffness [26]. 

Isolation of PBMCs 

Blood was collected into a BD Vacutainer K 2 EDTA (BD 
Bioscience, Franklin Lakes, NJ), diluted 1:1 with Hanks balanced 
salt solution (HBSS) (Invitrogen Life Technologies, Grand Island, 
NY), layered onto one volume of Lymphocyte Separation Medium 
(Cellgro Mediatech Inc., Manassas, VA), and centrifuged at 
700 xg for 30 min at room temperature. The mononuclear cells 
were collected, diluted 1:1 with washing buffer (PBS supplemented 
with 2 mM EDTA and 2% FBS) (Gemini Bio-Products, West 
Sacramento, CA), and centrifuged at 300 xg for 7 min. The pellet 
was resuspended in 5 mL of 0.8% ammonium chloride solution 
(STEMCELL Technologies Inc., Vancouver, Canada) for 5 min 
to lyse any remaining erythrocytes. To remove as many platelets as 
possible, the cells were washed two more times as described above 
with two volumes of washing buffer. 

RNA extraction and qRT-PCR 

Total RNA was isolated using the RNeasy Mini Kit (Qiagen, 
Valencia, CA) according to the manufacturer's protocol, tested for 
quality, and stored at — 80°C until use. The VILO kit (Life 
Technologies/Invitrogen, Grand Island, NY) was used to reverse 
transcribe 150 ng of total RNA. Primers (Qiagen) were diluted 
1:20 with molecular-grade water, and 5 uL/well were added to 
384-well plates using a Biomek FX Laboratory Automation 
Workstation (Beckman Coulter, Inc., Brea, CA). The plates were 
left to dry overnight in a sterile hood and stored covered at — 20°C 
until use. qRT-PCR was performed using SYBR Green (Qiagen) 
and a 7900HT Real-Time PCR System (Life Technologies/ 
Applied Biosystems, Foster City, CA) operated in standard mode. 
All of the runs contained a dissociation step. The samples were 
amplified in duplicate in a total volume of 5 |J,L. The results are 
expressed as the relative copy number (RCN), defined as 
RCN = 2" ACq xlOO, where ACq is the difference Cq(target) - 
Cq(reference) [27]. As a reference for normalization, we used the 
median Cq values of three endogenous controls (beta-2 micro- 
globulin, GAPDH and RPL1 3). Data were were uploaded to Gene 
Expression Omnibus (GSE56327). 

Fluorescence RNA in situ hybridization (ISH) 

Cells (up to lxlO 6 ) from eight volunteers were frxed in 
suspension (4% formaldehyde in PBS for 1 h) and deposited onto 
Superfrost Plus microscope slides (Fisher Scientific, Pittsburg, PA) 
by cytospin centrifugation (180 xg, 5 min) in a Cytospin 2 
Shandon Centrifuge (Block Scientific, Inc., Bohemia, NY). The 
slides were dried at 37°C for 1 h and stored in 100% ethanol at 
— 80°C until use. ISH was performed using the QuantiGene 
ViewRNA kit (Affymetrix/Panomics Solutions, Santa Clara, CA) 
according to the manufacturer's protocol with the following probe 
sets: FSHR-FITC, NES-Cy3, and KDR-Cy5, and the nuclei were 
counterstained with DAPI (Sigma-Aldrich, St. Louis, MO). In 
preliminary tests, we found that a 1:1000 dilution of proteinase K 
was optimal. The samples were analyzed with an Olympus 
FBI 000 confocal microscope (Olympus America Inc., Melville, 
NY). The 40x objective and 3x zoom were used to acquire Z- 
stacks from 5-8 fields per sample, for a total of 100-600 cells per 
subject. The acquisition settings were adjusted based on the 
negative control slides (no hybridization probe). The images were 
analyzed with CellProfiler 2.0 software [28] (http:/ /www. 



cellprofiler.org/), as follows, (i) Using FluoView software (Olym- 
pus), all of the images for each Z-stack per each channel were 
exported as .tiff files, (ii) Using MetaMorph software (Molecular 
Devices, Sunnyvale, CA), all of the images from the Z-stacks were 
assembled into a separate .stk file for each channel, (iii) Using 
CellProfiler, we built four pipelines, one for each channel (blue: 
DAPI/nuclei, green: FSHR-FITC, red: NES-Cy3, and cyan: 
KDR-Cy5), which uploaded the respective stacks and generated a 
projection. Steps 1-3 were necessary because the CellProfiler 
software could only recognize MetaMorph stacks, (iv) Next, we 
built a pipeline for the analysis of these projections. An important 
step was the setting of thresholds. To detect nuclei ('primary 
objects'), we entered a valid range of diameters (in pixels), and the 
threshold was set such that >95% of nuclei (including clumped 
nuclei) were detected accurately. Cell boundaries ('secondary 
objects') were estimated by moving 10 pixels outward from the cell 
nucleus (Distance-N method). For RNA quantification, the 
thresholds for each channel were set individually based on 
negative controls. We measured spot numbers, spot intensities, 
and a variety of nuclear parameters (such as intensity, area, and 
texture) offered by the software, (v) Finally, the data were exported 
into Excel and pooled from all of the images/subjects. At this 
point, we manually removed all nuclear objects that were 
inaccurately detected such as clumps that could not be segmented 
into individual nuclei and small fragments that clearly corre- 
sponded to cellular debris based on size (the 'nuclear area' 
parameter). Finally, each cell (and its associated measurements) 
was given a unique ID. All of the subsequent statistical analyses 
and visualizations were performed using various statistical 
packages (vide infra). 

Immunocytochemistry (ICC) 

For ICC, PBMCs from four subjects were fixed in 1% 
formaldehyde in PBS for 1 h, cytospun as described, and used 
immediately. The cells were permeabilized with Triton X-100 
(Fisher Biotech, Pittsburgh, PA) for 10 min at room temperature 
and blocked with 5% BSA + FcR Binding Inhibitor Purified 
(eBioscience, Inc., San Diego, CA) for 30 min at room temper- 
ature. The cells were incubated sequentially with the following 
antibodies: (i) rabbit polyclonal anti-FSHR antibody; (ii) Alexa 
Fluor-594 goat anti-rabbit antibody; and (iii) Alexa Fluor-488 
monoclonal anti-human nestin antibody (Abeam, Cambridge, 
MA), along with Alexa Fluor-647 mouse anti-human CD309 (BD 
Pharmingen, San Jose, CA). DAPI was used as a nuclear stain, and 
the cells were mounted in Fluoromount-G (SouthernBiotech, 
Birmingham, AT). Alexa Fluor-488, Alexa Fluor-647 mouse IgG 1 
isotype, and Alexa Fluor-594 goat anti-rabbit IgG (BD) were used 
as controls. The preparations were imaged with an Olympus FV 
1000 confocal microscope using the 40x objective and 2x zoom, 
and negative controls were used for setting the lasers as described 
for ISH. We acquired 5-6 random images per slide, for a total of 
300-700 cells per subject. In each fraction of cells (including most 
of those that were single- and double-positive for KDR and/or 
FSHR), NES and FSHR were localized to a few, usually 1-4, 
compact circular structures that were intimately associated with 
the nucleus or nuclear grooves (possibly representing spurious 
cross-reactions of our antibodies with either storage or degradation 
compartments). Therefore, we excluded these features from 
quantification. The images were analyzed with the CellProfiler 
software using a different pipeline. To adjust the threshold 
correction factors for each channel, we used the negative controls. 
The data were exported into Excel and further processed as 
described for ViewRNA, step 5. Both pipelines were uploaded on 
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Figure 1. qRT-PCR quantification of gene expression in PBMCs from a sample of a healthy human population. A. The average 
expression levels (indicated as the relative copy number, RCN = 2~ ACq x 1 00) of the tested genes ordered based on the strength of their covariation 
(compare to Fig. 2A). The data are expressed as the means ± SD. Inset. Cq values for housekeeping genes used as endogenous controls. Of note, the 
large SD displayed by CXCR4 was due not to outliers but to the skewness of the data distribution. B. Actual RCN values of the 45 tested genes in 26 
healthy subjects, indicating the coordinated expression of the majority of the genes (conventional color coding). 
doi:1 0.1 371 /journal.pone.00951 24.g001 



the CellProfiler website as VR-3Plex-FNK.cp and ICC-NFK.cp, 
respectively. 

In silico gene expression analysis 

Studies were selected from the National Center for Biotechnol- 
ogy Information's Gene Expression Omnibus (GEO) according to 
the following criteria: (i) they employed Affymetrix Human 
Genome U133 Plus 2.0 Arrays (platform GPL570); (if) they used 
isolated PBMCs; and (iii) they included healthy subjects as 
controls. We chose eight studies (GSE: 8507, 10041, 11761, 
14642, 19743, 21942, 27034, and 46480) and only included the 
GSM files that corresponded to controls in the analysis (a total of 
274 GeneChips), and a study on isolated bone marrow-derived 
CD34 + ceUs (GSE 23025) [29]. All of the .eel files were imported 
into Expression Console (Affymetrix), and we used the MASS 
algorithm for normalization and signal and presence call 
detection. We then extracted the information pertaining to the 



genes of interest. For genes represented by several probe sets on 
the array, we calculated the median signal value for each probe set 
over all the arrays and retained the sets that had the highest value. 
Finally, the 'Present', 'Marginal', and 'Absent' calls were recorded 
for each gene, and the 'Presence Score' was calculated as a 
percentage of all 274 arrays. Because genes with 'Marginal' scores 
were found only on a small number of arrays, all of these were 
considered 'Absent'. 

Pattern analysis and network visualization 

To mine patterns from the gene co-expression matrix data, we 
followed the network mining and merging workflow described by 
Xiang et al. [30] . First, we converted a gene co-expression dataset 
into a unweighted graph by creating an edge between any two 
genes with an absolute correlation value greater than 0.7. After the 
graph was created, we applied the Bron-Kerbosch algorithm [31] 
to generate all of the maximal cliques. We then applied the 
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Figure 2. Regression analysis of genes co-expressed in PBMCs isolated from healthy human adults. A. Correlations between several 
cardiovascular-specific genes. B. Correlations between selected vascular and primitive genes in the same population. C. Correlations between other 
tissue-specific and primitive genes. D. Inverse correlation between the expression of vascular genes and a leukocyte gene (PTPRC/CD45). In all the 
graphs, the number of subjects is n = 26, R 2 is the linear regression coefficient, and p indicates significance; these coefficients were placed in mirror 
positions across the diagonal with their corresponding graphs. The data represent RCNs. 
doi:1 0.1 371 /journal.pone.00951 24.g002 



network merge approach [30] to these cliques under a density 
threshold 0.8, which guaranteed that each resulting sub-network 
induced a sub-matrix with an average correlation value greater 
than 0.8 on the original gene co-expression matrix. Finally, we 
visualized the discovered sub-networks using Gephi [32] (www. 
gephi.org). 



Data analysis 

ANOVA, t-test, Mann-Whitney test, and various correlation 
statistics (Pearson correlation, linear regression, principal compo- 
nent analysis, hierarchical clustering, Cronbach's alpha) were 
performed using JMP 10.0.2 (SAS Institute, Inc., Cary, NC), 
Partek Discovery Suite v. 6.4 (Partek Inc., St. Louis, MO), and 
Microsoft Excel 2010 programs. In all of the statistical analyses, 
p<0.05 was considered significant. 
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Figure 3. Correlation and clustering analysis of data. A. A heat map of the bivariate correlation matrix of gene expression levels representing 
Pearson's correlation coefficient, r, in descending order, beginning with KDR/VEGFR2 (red: positive correlation; blue: negative correlation; gray: no 
correlation). B. The corresponding probability values, p, after Bonferroni correction (red: p<0.05; gray: not significant). C. An unsupervised 
hierarchical clustering analysis (complete linkage on standardized data) representing the associations between the genes as distances (Y-axis). In A 
and C, the corresponding main gene groups are indicated by brackets. 
doi:1 0.1 371 /journal.pone.00951 24.g003 



Results 

Limitations of the microarrays in detecting mRNAs with 
low expression levels 

We first attempted to determine the transcriptional signature of 
CSPCs in blood using the Affymetrix GeneChips, as expression 
levels of a panel of the most common stem/progenitor cell and 
differentiation marker genes (Table SI, with annotations). We 
analyzed a set of 274 microarrays compounded from public 
databases, representing 7 studies of PBMCs and one study of 
purified CD34 stem cells, all from healthy human subjects. Gene 
detection, determined as a 'Present' call by the MAS5 algorithm, 



varied as follows, (i) Other than housekeeping genes (B2M, 
GAPDH, and RPL13), only leukocyte genes (CD14, GD79A, 
CD3E, ITGAM, PECAM1, NT5E/CD73, and PTPRC/CD45) 
and a few others (ACTA2, ALDH1A1, BGLAP/osteocalcin, 
CX3CR1, and CXCR4) had a 'Present' call on more than 90% of 
the arrays, (ii) Several primitive (GATA4, NANOG, NES, NKX2- 
5, POUF5F1/OCT4, and THY1) and differentiation (CAV3, 
CDH5, CNN1, FSHR, KRT14, NOS3, and TEK/TIE2) genes 
were completely undetected (i.e., 0% 'Present' call), (iii) The other 
tested genes had a variable representation on the arrays (Table 
S2). Surprisingly, even the mRNA for CD34, the marker for which 
the cell suspension had been enriched using magnetic immuno- 
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Figure 4. Modular organization of a PBMC gene sub-network. A. Network representation of the genetic covariation. The thickness of each 
connecting line is proportional to the absolute value of the respective Pearson's correlation coefficient. Genes that were significantly correlated with 
the age, Alx, AoPP and BMI of the subjects are encircled (cf. Table 3). Color coding identifies the participation of genes in separate underlying clusters 
[30]. B. Scaling of nodes' clustering coefficient C(k) with their connectivity degrees k, as a signature of hierarchical networks. Note that the data 
spontaneously split into two subpopulations, suggesting distinctly organized modules (for clarity, the leukocyte genes were omitted). Members of 
Module 1 (right), corresponding to the functionally filtered group in A (same color convention), had higher clustering values for similar k values than 
those in Module 2 (left), indicating stronger transcriptional coupling. C. Genes connected to the KDR node. Note that these connections perfectly 
overlap those of Module 1, while PROM1/CD133 serves as a link with Module 2, arguing that KDR is a hub node of Module 1, D. Connections of 
another hub node, NES. The images in A, C and D are based on Pearson correlation coefficients r>|0.8| and were obtained using the software Gephi 
0.8 beta (www.gephi.org). The data shown in B were also obtained using Gephi, based on the network analysis in A. 
doi:1 0.1 371 /journal.pone.00951 24.g004 



selection, was detected on only 80% of the arrays. These findings 
confirmed the low sensitivity and/or poor reliability of the 
GeneChip microarrays for the detection of rare transcripts, as 
previously noted [33]. 



Quantification by qRT-PCR and organization of a PBMC 
transcriptional sub-system 

In contrast, all of the tested mRNAs were detected in PBMCs 
from all of the healthy adults using qRT-PCR (Fig. 1 and Table 
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Table 3. Module 1 gene correlations with physiologic parameters. 



Gene 


Age 




Alx' 11 




AoPP 121 




BMI (females) 


3) 


r <4, 


p-value 


r 


p-value 


r 


p-value 


r 


p-value 


ABCG2 151 


-0.463 


0.01 72 


-0.430 


0.0284 


-0.447 


0.0221 


0.779 


0.0006 


ADIPOQ 


-0.469 


0.0158 


-0.364 


0.0674 


-0.306 


0.1279 


0.541 


0.0371 


ALPL 


-0.420 


0.0327 


-0.462 


0.0176 


-0.398 


0.0442 


0.889 


0.0000 


CAV3 


-0.452 


0.0203 


-0.487 


0.0116 


-0.405 


0.0403 


0.828 


0.0001 


CDH5 


-0.461 


0.0177 


-0.340 


0.0892 


-0.348 


0.0819 


0.830 


0.0001 


CNN1 


-0.475 


0.0142 


-0.490 


0.0110 


-0.469 


0.0157 


0.776 


0.0007 


FSHR 


-0.514 


0.0073 


-0.507 


0.0083 


-0.476 


0.0139 


0.772 


0.0007 


GATA4 


-0.446 


0.0223 


-0.410 


0.0355 


-0.286 


0.1566 


0.597 


0.0187 


KDR 


-0.398 


0.0442 


-0.420 


0.0324 


-0.399 


0.0481 


0.830 


0.0001 


KIT 


-0.451 


0.0209 


-0.470 


0.0155 


-0.414 


0.0356 


0.853 


0.0001 


KRT14 


-0.436 


0.0258 


-0.425 


0.0306 


-0.416 


0.0348 


0.129 


0.5301 


MAP2 


-0.512 


0.0074 


-0.470 


0.0155 


-0.341 


0.0886 


0.834 


0.0001 


MKI67 


-0.479 


0.0132 


-0.375 


0.0592 


-0.232 


0.2548 


0.773 


0.0007 


NANOG 


-0.468 


0.0159 


-0.464 


0.01 69 


-0.451 


0.0209 


0.833 


0.0001 


NES 


-0.519 


0.0065 


-0.507 


0.0083 


-0.413 


0.0359 


0.832 


0.0001 


NKX2-5 


-0.462 


0.0176 


-0.469 


0.0155 


-0.410 


0.0374 


0.845 


0.0001 


NOS3 


-0.460 


0.0180 


-0.527 


0.0057 


-0.520 


0.0065 


0.764 


0.0009 


NOTCH4 


-0.457 


0.0190 


-0.494 


0.0103 


-0.458 


0.0187 


0.738 


0.0017 


OLR1 


-0.526 


0.0058 


-0.479 


0.0133 


-0.406 


0.0396 


0.823 


0.0002 


POU5F1 


-0.493 


0.0104 


-0.478 


0.0136 


-0.462 


0.0175 


0.824 


0.0002 


PR0M1 


-0.489 


0.0112 


-0.398 


0.0438 


-0.259 


0.2012 


0.563 


0.0288 


TEK 


-0.480 


0.0132 


-0.471 


0.0152 


-0.412 


0.0366 


0.766 


0.0009 



' Augmentation index, a surrogate measure of vascular stiffness extracted from radial pulse wave measurements, as described in the Methods. 

(2) Aortic pulse pressure. 

(3) Significant correlations with BMI were found only in females. 

(4) Pearson's correlation coefficient. 

(5) Genes that were significantly correlated with all four parameters are in bold. 
doi:10.1371/joumal.pone.0095124.t003 

SI). Genes with the smallest relative copy numbers (RCNs) 
represented the primitive and tissue-specific (including cardiovas- 
cular) genes, whereas the leukocyte-associated genes had much 
larger RCNs (Fig. 1A). We also observed a pattern suggestive of 
covariation between many of the mRNA markers (Fig. IB). 
Regression analysis demonstrated that indeed the expression levels 
of cardiovascular genes were highly correlated with each other and 
with those of several primitive genes (Fig. 2A and 2B, respectively). 
A strong covariation was demonstrated as well between several 
tissue-associated genes and a different set of primitive markers 
(Fig. 2C). In these comparisons, we also observed poor and/or 
even negative correlations, e.g., between primitive or cardiovas- 
cular genes and leukocyte markers, such as CD45 (Fig. 2D). 

These results were centralized in a covariation matrix (heat- 
map) using the hematopoietic/endothelial progenitor marker 
KDR/VEGFR2 as a reference [1] (Fig. 3A). The associated 
statistics matrix, after adjustment for multiple comparisons, 
revealed that a large majority of the Pearson coefficients were 
also statistically significant (Fig. 3B). The majority of primitive 
genes co-segregated with KDR in a group that contained most of 
the cardiovascular genes (Fig. 3A, B). This information was 
retrieved using unsupervised hierarchical clustering, thus objec- 
tively confirming that the majority of the primitive and cardio- 
vascular genes aggregated together (Fig. 3C, solid bracket), while 
the leukocyte genes were less organized, and the tissue differen- 



tiation markers formed a separate cluster (Fig. 3C, dashed 
bracket). 

Network analysis of gene markers 

Because the known CSPC types are characterized by multiple 
combinations of molecular markers [7], which were not well 
captured by dendrograms due to their linear character, we also 
used a network representation [34]. In this analysis, genes are 
considered nodes connected to other genes via links (or edges), 
based on a preset level of probabilistic significance (in our case, 
Pearson correlation coefficients larger than 0.8 at p<0.05). The 
number of edges between a node and the other genes to which it is 
connected is called its degree k, and the probability of its 
connections with all the other genes in the network is measured by 
a clustering coefficient C(k) [34]. This bottom-up analysis retrieved 
the grouping of marker genes shown in Fig. 3A as tightly 
interconnected gene communities [35] (Fig. 4A). To validate the 
modular properties of these communities, we displayed each 
node's clustering coefficient C(k) as dependent on its degree k. A 
negative relationship on a log-log scale between k and C(k) is 
considered the signature of hierarchical modularity in a network 
[34]. In this respect, our data objectively revealed the existence of 
two distinctly organized modules (Fig. 4B), each composed of the 
same assortment of primitive and differentiation markers as the 
clusters described in Fig. 3. The genes in Module 1 had higher 
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Figure 5. Modular index (Ml) of Module 1 (metagene) associated with the age and cardiovascular parameters of blood donors. A. 

Age-dependent variation in Ml in the population. B. Regression analysis of Ml on Alx. C, D. Correlations of Ml with AoPP and BMI (in women). Ml 
represents the 1 5-gene average of the standardized (mean = 0, SD = 1 ) RCN of each gene within the tested population, +/— SD. n = 26 for A-C and 
n = 14 for D. Note the apparently bimodal distribution of Ml with age in this population. 
doi:1 0.1 371 /journal.pone.00951 24.g005 



values of clustering coefficients C(k) for any given value of their 
degree k compared with those in Module 2 (Fig. 4B). Some genes 
(e.g., CD34, ST3GAL2, and CDH5) had intermediate values of 
connectivity, indicating their position in a transition interface 
between the modules. 

To further confirm the modular nature of this transcriptional 
network, some nodes (called hubs) were directly connected with all 
of the others in the corresponding module and with key nodes in 
the neighboring communities [34]. Based on this criterion, two 
nodes (KDR and NES, Fig. 4C, D) were connected with all of 
their neighbors, and both extended to PROM 1 /CD 133 as a 
common contact node in Module 2, a property that prompted us 
to further investigate these nodes at the single-cell level (see below). 

Cardiovascular functional validation of genes from 
Module 1 

To determine the physiological significance of the genes 
contributing to these co-expression patterns, we studied their 
individual relationships with the following characteristics of the 
blood donors: (i) age; (ii) the augmentation index (Alx) of their 
pulse, a surrogate measure of vascular stiffness [36]; (iii) blood 
pressure parameters (measured using applanation tonometry [37]); 
and (iv) body mass index (BMI). Significant correlations (p<0.05) 
were observed between these physiological variables and a group 
of 15 genes (Table 3), all of which belonged to the same module 
(Module 1, Fig. 3C, bracket; Fig. 4A, encircled; and Fig. 4B). 



These 15 genes were the following (as annotated in Table SI): (i) 
primitive: CD117/KIT, CD338/ABCG2, NANOG, NOTCH4, 
and POU5F1/OCT4; (ii) primitive /endothelial: CD34, KDR/ 
VEGFR2, NES, NOS3/eNOS, OLR1/LOX-1, and TEK/ 
TIE2, as well as ALPL (alkaline phosphatase, also mesenchymal), 
CNN1, and FSHR (neovascular); and (iii) primitive / cardiac: NKX2- 
5 and CAV3. The relationship between the Module 1 genes and 
BMI showed marked differences between the sexes; in females 
only, these genes were positively correlated with BMI. Additional 
Module 1 genes (and a few genes from Module 2) were significant 
in only two or three of the four tests (Table 2) and/ or exhibited 
lower network connectivity at the interface of the two modules 
(Fig. 4B) and were thus not included in the definition of Module 1 . 

Personalized representation of gene analysis 

We sought to consolidate the individual gene expression 
information into a parameter that would combine the contribu- 
tions of all of the 1 5 Module 1 genes that exhibited significance in 
the above correlations with the physiological variables (Table 2). 
For each healthy subject, we calculated a modular index (MI) (also 
known as a metagene) [17], the average of the standardized RCN 
values of the module's genes (Fig. 5). As expected from the 
dependence of individual genes on the respective physiological 
variables, the MI values of all the subjects exhibited negative 
correlations with age (Fig. 5A), vascular stiffness (Fig. 5B), and 
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Figure 6. Personalized representation of gene expression as radial graphs. A. The reference population level (RCN median, n = 26, 1 00%) is 
shown in blue, and the corresponding individual percentage values for the specified genes are shown in red in the order used in Fig. 2A. Note the 
pattern differences in Module 1 genes between females with normal BMIs (upper row) and those with higher BMIs (lower row). B. Association of 
aortic stiffness (left-side images) with radial gene profiles (right-side graphs) of two subjects: left, female, 56 years, average stiffness of 5.25 kPa; right: 
male, 52 years, average stiffness of 6.17 kPa. Aortic stiffness was measured by MRE, as described the Methods section; the local elasticity distribution 
is color coded, as shown in the scale at right (in kPa). 
doi:1 0.1 371 /journal.pone.00951 24.g006 



aortic pulse pressure (Fig. 5C), as well as a positive correlation with 
BMI in females (Fig. 5D). 

Additionally, we investigated whether these 15 genes were 
sufficient for the ability of MI to correctly characterize the 
correlation between the metagene and the respective physiological 
variable (e.g., age, AIx, and blood pressure). Therefore, we 
calculated the Cronbach's alpha coefficient for internal consisten- 
cy, defined as the correlation between different test items (in our 
case, the genes), that determines whether these items collectively 



represent the same general construct [38] . High Cronbach's alpha 
coefficients are generally desirable, but if too high (conventionally 
>0.95), they may indicate item redundancy [38]. For the genes 
contained in the metagene MI, the Cronbach's coefficient was 
indeed >0.95, indicating that this set of variables was not only 
consistent but also saturated. 

For a more detailed representation of gene-specific information, 
we also generated radial diagrams that displayed the individual 
values of analyzed genes as the percentages vs. the median levels in 
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Figure 7. Comparison of gene expression in healthy and hypertensive subjects. A. The median RCN in healthy controls (CT, blue) vs. 
hypertensive patients (HT, red); all of the differences are significant (p<0.05), with the exception of those labeled by arrows. Inset. Cq values of 
housekeeping genes in this patient population. B. Box plots of the aggregated Ml (averages of normalized RCN values of Module 1 genes) in the 
control subjects and patients. (Box plots show the median values, 1 st and 3 rd quartiles, and the interquartile range; symbols are as in A). 
doi:1 0.1 371 /journal.pone.00951 24.g007 



the population. For example, we compared the personalized 
diagrams of female blood donors with normal weights against 
those with BMIs in the overweight range (Fig. 6A). We noted that 
the sector of the diagram containing Module 1 genes was better 
represented in the latter group (in agreement with the positive 
correlation with BMI found for Module 1 genes in women, 
Fig. 5D), while the relative amplitude of Module 2 members 
decreased progressively. 

The benefit of this representation was also illustrated by 
comparing gene expression patterns in subjects with various 
degrees of vascular stiffness, objectively determined by MRE. The 
results showed a reduction of gene expression in both modules 
with increasing aortic stiffness in these otherwise healthy subjects 
(Fig. 6B). 

Modifications of PBMC gene expression in hypertensive 
patients 

The genes from Module 1 had inverse correlations with several 
cardiovascular risk factors (age, blood pressure, and aortic 
stiffness), which suggested that even stronger alterations might be 
found in patients with established vascular pathologies. To test this 
hypothesis, we utilized a cohort of cardiovascular patients treated 
for hypertension in an outpatient cardiovascular clinic (Table 3). 
We found that the expression levels of the tested genes in PBMCs 
from these patients closely paralleled those in healthy subjects 
(Fig. 7A). However, with a few exceptions involving non-significant 
differences (Fig. 7A), the RCNs were reduced compared with the 
healthy population, an observation also captured using their 
metagene MI (Fig. 7B). Covariation heat-maps revealed obvious 
disruptions in the co-expression patterns of these genes in the 
patients compared with the healthy controls (Fig. 8A, B; compare 
with Fig. 3A, B). The hierarchical clustering was also sensitive to 
these rearrangements (Fig. 8C). The network representation 
showed that the connectivity of nodes within and between both 
modules was largely increased (Fig. 8D), as they collapsed in a 
common, highly coupled sub-network (Fig. 8E). Remarkably, the 
range of connectivity degree k values of these module's nodes in 
patients was reduced (23-26, Fig. 8E) compared with that in the 



healthy subjects (17-25, Fig. 4B), arguing for lower connection 
variability among node genes. 

The radial diagrams indicated that in addition to the 
conspicuous reduction in expression levels in both modules 
(Fig. 9A), the patients treated with the diuretic thiazide displayed 
gene levels closer to normal than those taking other medications 
(Fig. 9B). The patients with hypertension exhibited unchanged 
levels of ALPL, a tissue-type alkaline phosphatase gene expressed 
by many non-vascular cells (including osteoblasts [39]) but also by 
vascular lineage CSPCs [40]. The following genes also exhibited 
module-independent variations among patients: (i) ALDH1A1, a 
marker of primitivity with numerous cardiovascular implications 
[41]; (ii) CXCR4, the receptor for SDF-1, which is essential for 
CSPC recruitment [42]; and (iii) ST3GAL2, a transcriptional 
marker of SSEA-4 [43], which has been associated with adult bone 
marrow-derived mesenchymal stem cells [44], and with the 
controversial [45] very small embryonic-like stem cells [46]. 

Cellular origins of gene co-expression 

To identify the origins of gene expression we analyzed mRNA 
and protein expression at the single-cell level. We used cytospun 
PBMCs probed for the Module 1 hub nodes KDR and NES. To 
these, we added the follicle-stimulating hormone receptor (FSHR), 
a marker of neovascular endothelium [47], which was strongly 
correlated with the other two nodes (Fig. 2A and Table SI). For 
mRNA detection, we performed in situ hybridization, followed by 
an automated image analysis. In all of the tested samples, we found 
multiple cells expressing these marker mRNAs in various 
proportions (Fig. 10A, B). To determine whether the Module 1 
covariation pattern was derived from the coordinated presence in 
the circulation of cells separately expressing these genes, we 
performed two-by-two comparisons of the frequencies of single- 
positive cells in the same individuals, which did not show a 
correlation (Fig. 10C-E, insets). Instead, in support of a genuine 
within-cell co-expression mechanism, we found a high propor- 
tionality of the signals from all three mRNAs only within triple- 
positive cells (Fig. 10C-E). 
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Figure 8. Changes in the modular organization of genes in hypertensive patients (n = 20; see Figs. 3 and 4 for details). A. A heat map 
of intergenic covariation. B. The corresponding matrix of significance values following Bonferroni corrections. C. Dendrogram of hierarchical gene 
clustering. D. The network structure of the patient genes, indicating the fusion of Modules 1 and 2 of the network found in the healthy subjects (red). 
E. The relationship between the gene clustering coefficient C(k) and node degree (connectivity) k; the collapsed sub-network shows a very strong and 
nearly uniform connection between nodes (inset), suggestive of transcriptional primitivity. The data analysis and representation were performed as in 
Figs. 3 and 4. 

doi:1 0.1 371 /journal.pone.00951 24.g008 



Given that nuclear morphology is sensitive to a cell's 
transcriptional activity [48], we examined whether nuclear 
parameters obtained by image analysis could distinguish the cells 
co-expressing the Module 1 -related mRNAs form the other 
PBMCs. Indeed, nuclear shape and texture analysis showed that 
the nuclei of cells triple-positive for KDR/NES/FSHR exhibited a 
distinct morphology (Fig. 10F). 

Finally, we searched for the proteins encoded by the mRNAs of 
interest by performing immunocytochemistry on parallel slides 
and found cells expressing all possible combinations of these 
antigens (Fig. 1 1A). The nuclear morphology of the triple-positive 
cells revealed by immunocytochemistry was similar to that 
detected by in situ hybridization, i.e., a larger apparent nuclear 
area and less textured compared with the other cells (Fig. 1 IB), 
thus confirming that the two methods detected the same cellular 
populations. In support of this observation, the frequency 
distributions of the antigen expression classes (i.e., single, double, 
or triple expressers) were also similar to those detected using 



mRNA in situ hybridization in blood samples from the same 
individuals (Fig. 11C). 

Discussion 

Here, we report several novel observations derived from an 
analysis of transcriptional activity in PBMCs: (i) unlike GeneChips, 
which failed to reliably detect approximately half of the gene 
targets, qRT-PCR detected all 45 members of a panel of primitive 
and differentiation marker genes in all the tested human blood 
samples; (ii) based on their strong covariation, the target genes 
segregated into two major clusters, which exhibited the connec- 
tivity properties of modules in a bottom-up reconstituted 
hierarchical transcriptional network; (iii) one of modules contained 
most of the primitive and cardiovascular differentiation markers; 
(iv) this module also correlated with several cardiovascular risk 
factors in the healthy blood donors, contributing to a cardiovas- 
cular-specific metagene; (v) the origin of genetic covariation was 




Figure 9. Radial representation of relative gene expression in hypertensive patients. A. Representative radial diagrams of patients with 
hypertension. Note the overall reduction in expression, with the exception of a subgroup of genes (e.g., ALPL, ITGAM, and PTPRC/CD45; compare 
with Fig. 7A). B. Radial diagrams of hypertensive patients treated with thiazide; in these cases, the gene patterns in the two modules were closer to 
normal (i.e. closer to the blue reference line). 
doi:1 0.1 371 /journal.pone.00951 24.g009 
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Figure 10. Origins of marker gene co-expression within individual cells: in situ hybridization (ISH). A. ISM analysis of the Module 1 hub 
genes KDR and NES and the node gene FSHR, identified by their fluorescent signals in a given microscopic field (brown masks were added to positive 
cells by the CellProfiler image analysis software; blue represents DAPI staining of nuclei). Arrow: a triple-positive cell. B. Four-color confocal images of 
cells that are positive for (a) all three markers; (b) NES and FSHR only; or (c) KDR and NES only (white: KDR; red: NES; green: FSHR; blue: nuclei). Bars: 
5 |im. C-E. Linear regression of the integrated pixel intensity of the mRNA of each marker gene (KDR, NES, and FSHR) detected using ISH in triple- 
positive cells (n = 66 cells pooled from 8 individuals; r: Pearson's correlation coefficient; r 2 : regression coefficient; log-log scale). Inset graphs show the 
lack of correlation between mRNA expression (also measured as the integrated pixel intensity) in single-positive cells for each respective pair of 
markers. F. Nuclear area and several texture features calculated using the CellProfiler analysis significantly separated the triple-positive cells from the 
other cells (*, p<0.05 for single and double expressers vs. negatives; #, p<0.05 for single and double expressers vs. triple positives; p<0.05 for 
triple expressers vs. negatives) (a total of 2094 cells from 8 subjects were analyzed). Abbreviations: Comp., compactness; Int. Int., integrated intensity; 
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Mean Int., mean intensity; Med. Int., median intensity; TDV, texture difference variance; TC, texture contrast; TV, texture variance; TSA, texture sum 

average. The data represent the means of standardized values ± SEM. 

doi:10.1371/journal.pone.0095124.g010 



found within individual cells of a subpopulation with distinctive 
nuclear properties; (vi) the levels of gene expression in both 
modules were significantly reduced in hypertensive patients; and 
(vii) the connectivity within the PBMC gene modules was largely 
amplified in hypertensive patients, leading to the fusion of these 
modules into a common sub-network. 

The properties of these gene modules were consistent with what 
would be expected if they derived from CSPCs; in particular, 
Module 1 could be the signature of circulating endothelial 
progenitor cells. The dependence of Module 1 genes on BMI (in 
women) is in agreement with the finding that CSPCs are under the 
influence of adipose tissue, both as a source of pro-angiogenic 
chemokines (e.g., adiponectin, known to induce the release of 
CSPCs from bone marrow [49]) and as a possible direct originator 
of CSPCs [50]. However, these gene modules were not equated 
with any known underlying cellular classes for the following 
reasons: (i) the stochastic nature of gene expression [51,52]; (ii) cell 
plasticity, manifested as a transcriptionally dependent propensity 



for transdifferentiation [53]; and (iii) the possible horizontal 
redistribution of mRNA among different cell classes, possibly via 
extracellular vesicles acting as intercellular RNA carriers [54], 
which are particularly active among bone marrow stem cells [55]. 
Instead, we consider that although the Module 1 markers are 
primarily expressed by EPCs [7], they also receive contributions 
from other leukocytes with roles in maintaining vascular function, 
such as angiogenic (TEK/TIE2 + ) monocytes [56] or angiogenic T 
cells [57]. Other circulating bone marrow-derived cells of less 
certain nature also contribute to the maintenance of microvascular 
tone and normal blood pressure [58]. 

Equally important, many T lymphocytes positive for CD3 1 and 
CXCR4 were found at the core of the in wfro-formed colonies 
known as 'early' EPCs [59], which are bordered by KDR/ 
VEGFR2 + monocytic cells [60]. Despite their obvious non- 
endothelial nature, the numbers of these heterogeneous, blood- 
derived cell aggregates were highly correlated with the vascular 
function of the blood donors [61]. In a larger context, leukocytes 
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Figure 1 1 . Detection of cells expressing representative Module 1 node proteins using immunocytochemistry (ICC). A. Cells expressing 
various levels of KDR (white), NES (green), and/or FSHR (red). Nuclei are blue (DAPI). a, b: NES-FSHR double-positive cells; c: a triple-positive cell. Bars: 
5 am. B. A nuclear morphology analysis revealed alterations in the triple-positive cells detected using ICC that were comparable to those found using 
ISH (see Fig. 10 for abbreviations). The data represent the means of standardized values ± SEM; a total of 1655 cells were analyzed. C. Frequencies of 
cells positive for the three marker genes, detected using ISH (gray bars) and ICC (black bars) (n = 2094 and 1655 cells, respectively; none of the inter- 
method comparisons were significant, demonstrating that they detected the same cell populations). The data represent the means of individual 
blood donors ± SEM. 
doi:1 0.1 371 /journal.pone.00951 24.g01 1 
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often express other tissue-specific genes, revealing a refined but 
poorly understood transcriptional cross-talk between blood and 
perfused tissues [15]. Thus, we consider that the gene expression 
pattern detected using our method diverges from the simple notion 
of circulating progenitor cells into a more complex underlying 
biological reality that is highly meaningful for vascular function. In 
addition, we assume that the Module 1 genes do not derive from 
circulating adult endothelial cells detached from the vascular 
intima, because the endothelial functional marker von Willebrand 
factor was not present among these genes. The fact that a class of 
cells with distinctive nuclear properties, suggestive of a transcrip- 
tionally active euchromatin, coordinately express the neovascular 
markers KDR, NES and FSHR, provides additional evidence that 
Module 1 genes derive from primitive rather than differentiated 
cells. However, we consider highly unlikely that all the 15 
members of Module 1 to be simultaneously expressed by only one 
cell category, and thus we maintain that the origin of Module 1 is 
an emergent property of the CSPCs system. 

Vascular stiffness determines an individual's susceptibility to 
atherosclerotic plaque formation by predisposing the intimal 
endothelium to increased permeability to lipoproteins and the 
accumulation of monocytes [62]. Therefore, the presence of 
endothelial differentiation genes associated with primitive genes in 
the structure of Module 1 and their inverse correlation with AIx 
indicate a protective role of the respective blood cells in vessel 
health. Of note, AIx did not exhibit a direct relationship with the 
age of the blood donors in our limited population, arguing that the 
dependence of AIx on Module 1 components is not indirectly 
mediated through the effect of age on the expression of these 
components. The decrease in the levels of gene expression in 
PBMCs with an increase in the blood donor's age may be due to 
variations in the number of gene-expressing cells, changes in these 
cells' transcriptional activity (such as the known aging-sensitive 
dependence of transcription on the methylation of CpG islands in 
gene regulatory elements [63]), or both. Additionally, the inverse 
association between Module 1 -derived MI and central aortic pulse 
pressure (AoPP), which was recently shown to predict the future 
development of hypertension in healthy human populations [64], 
suggests a potential role of Module 1 genes in protecting small 
resistance vessels. 

Decreased collective expression of the genes in both modules 
was observed among the patients with established (and treated) 
hypertension. This finding is consistent with the reduction of 
CSPCs, including CD34-positive cells [65], in patients with 
hypertension, possibly as a direct effect of angiotensin [66] or in 
response to anti-hypertensive treatments [67]. Other novel 
observations of this study are that in the hypertensive patients, 
there is increased covariation of the genes in both modules and the 
distinction between these two modules vanishes. The increased 
network connectivity in the patients, despite the overall reduction 
in expression levels, suggests an amplified transcriptional coordi- 
nation. In terms of gene network organization, the observed 
reduced informational heterogeneity is the signature of a less- 
differentiated state of the marker's originating cells, which is 
consistent with the higher transcriptional network entropy of 
primitive cells [68] . In support of this possibility, hematopoiesis is 
amplified in cardiovascular patients, based on blood gene 
expression signatures [17]. 

Among the limitations of this study is the relatively small 
population sample size. Despite this limitation, the correlations 
between genes were strong and significant, highlighting the power 



of our bottom-up network reconstruction method to extract 
meaningful information from small human populations. Admit- 
tedly, the set of genes examined here could still be an incomplete 
representation of the actual underlying module from where our 
target markers were extracted. This topic is worthy of future 
exploration, although we demonstrated that the number of 
components was sufficient for the use of Module 1 in the current 
form as an aggregate biomarker (or metagene) of vascular 
function. Finally, the other module that surfaced from our analysis 
(Module 2 in normal subjects) remains to be explored to determine 
its significance for the repair capacity of blood, which may occur in 
an organ- and/or disease-specific manner. 

In conclusion, the results reported herein constitute the proof of 
concept for a novel bottom-up approach that is more sensitive and 
more accurate than the currently used high-throughput methods 
for the generation of a peripheral blood transcriptional network 
module useful for studying the collective contribution of circulat- 
ing cells to vascular and tissue maintenance and repair. 
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Expression Console). The following GEO datasets were used to 
generate these data: GSE8507, GSE10041, GSE11761, 
GSE14642, GSE19743, GSE21942, GSE27034, and GSE46480 
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